ClearTK 2.0: Design Patterns for Machine Learning in UIMA
Authors
Abstract
ClearTK adds machine learning functionality to the UIMA framework, providing wrappers to popular machine learning libraries, a rich feature extraction library that works across different classifiers, and utilities for applying and evaluating machine learning models. Since its inception in 2008, ClearTK has evolved in response to feedback from developers and the community. This evolution has followed a number of important design principles including: conceptually simple annotator interfaces, readable pipeline descriptions, minimal collection readers, type system agnostic code, modules organized for ease of import, and assisting user comprehension of the complex UIMA framework.
Similar Resources
The CLaC Discourse Parser at CoNLL-2015
This paper describes our submission (kosseim15) to the CoNLL-2015 shared task on shallow discourse parsing. We used the UIMA framework to develop our parser and used ClearTK to add machine learning functionality to the UIMA framework. Overall, our parser achieves a result of 17.3 F1 on the identification of discourse relations on the blind CoNLL-2015 test set, ranking in sixth place.
Building Test Suites for UIMA Components
We summarize our experiences building a comprehensive suite of tests for a statistical natural language processing toolkit, ClearTK. We describe some of the challenges we encountered, introduce a software project that emerged from these efforts, summarize our resulting test suite, and discuss some of the les-
ClearTK-TimeML: A minimalist approach to TempEval 2013
The ClearTK-TimeML submission to TempEval 2013 competed in all English tasks: identifying events, identifying times, and identifying temporal relations. The system is a pipeline of machine-learning models, each with a small set of features from a simple morpho-syntactic annotation pipeline, and where temporal relations are only predicted for a small set of syntactic constructions and relation t...
CFE - A System for Testing, Evaluation and Machine Learning of UIMA Based Applications
There is a vast quantity of information available in unstructured form, and the academic and scientific communities are increasingly looking into new techniques for extracting key elements finding the structure in the unstructured. There are various ways to identify and extract this type of data; one leading system, which we will focus on, is the UIMA framework. Tasks that are often desirable t...
Combination of Rule-based and Machine Learning for Biomedical Event Extraction
This paper describes a method for biomedical event extraction. Biomedical events occur in relation to biomedical concepts (objects) such as proteins and genes. In this work, we try a hybrid method to identify given event types relative to a given set of proteins in biomedical text. The approach combines rule-based methods and machine learning. A set of rules is built based on event triggers, and a set o...
Journal title:
- LREC ... International Conference on Language Resources & Evaluation : [proceedings]. International Conference on Language Resources and Evaluation
Volume 2014, Issue
Pages -
Publication date 2014